Deep learning based video-related super-resolution technique: a survey
Authors
Abstract
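Video super-resolution plays a key role in satellite remote sensing, video surveillance, medical imaging, and related areas. It has broad application prospects and has attracted wide attention, but traditional video super-resolution algorithms have clear limitations. As deep learning has matured, super-resolution algorithms based on deep neural networks have made great progress in performance. By fully fusing the spatio-temporal information in a video, realistic and natural textures can be restored quickly and efficiently, and this advantage has made video super-resolution a research hotspot. This paper systematically reviews research progress in deep learning based video super-resolution and summarizes its datasets and evaluation metrics. Existing methods are divided by research approach into two major classes: methods based on image alignment and methods without image alignment. Based on the model structure of the deep convolutional neural network, the history of model optimization, and the motion estimation and compensation method, video super-resolution networks are further subdivided into 10 sub-classes, and experimental data are used to compare the core idea of each method and the strengths and weaknesses of its network structure. Although reconstruction quality keeps improving, parameter counts are gradually decreasing, and training and inference are becoming faster, existing network models still have room for improvement. The paper closes with a discussion of the challenges and future prospects of deep learning based video super-resolution.

Video super-resolution (VSR) restores a high-resolution video from its low-resolution version in order to improve its quality. It has been developed intensively in domains such as satellite remote sensing, video surveillance, medical imaging, and electronics. To reconstruct frames, conventional video super-resolution methods estimate potential motion, blur kernels, and other parameters, which makes them struggle with heterogeneous scenes. Because they can quickly integrate spatio-temporal information and recover realistic, natural textures, emerging deep learning based super-resolution algorithms have advanced dramatically. We systematically review and analyze the current state of the literature. First, popular datasets are introduced: the YCbCr datasets YUV25, YUV21, and ultra video group (UVG), as well as RGB datasets such as video 4 (Vid4), realistic and dynamic scenes (REDS), and Vimeo90K. The profile of each dataset is summarized, including name, year of publication, number of videos, frame number, and resolution. Furthermore, the key evaluation metrics are described in detail: peak signal-to-noise ratio (PSNR), structural similarity (SSIM), video quality model for variable frame delay (VQM_VFD), and learned perceptual image patch similarity (LPIPS). Compared with single-image super-resolution, video super-resolution can draw on richer inter-frame information. If a video is processed frame by frame with a single-image method, the reconstructed video contains a large number of artifacts.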
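As a point of reference for the PSNR metric named in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It assumes frames are given as plain 2D lists of gray values with a peak value of 255; the function name `psnr` and its parameters are illustrative and are not taken from the surveyed paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Return the PSNR in dB between two equally sized 2D frames.

    Frames are lists of rows of numeric pixel values (an assumption
    made for this sketch; real pipelines usually use arrays).
    """
    if len(reference) != len(restored):
        raise ValueError("frames must have the same height")
    squared_error = 0.0
    count = 0
    for ref_row, out_row in zip(reference, restored):
        if len(ref_row) != len(out_row):
            raise ValueError("frames must have the same width")
        for ref_pixel, out_pixel in zip(ref_row, out_row):
            squared_error += (ref_pixel - out_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For video, the per-frame values are typically averaged over the sequence; whether the metric is computed on the Y channel or on RGB differs between benchmarks and is not specified here.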
We analyze video super-resolution in terms of two key technical challenges: alignment and feature integration. For alignment, the choice of module differs considerably between methods, which are categorized as image-alignment methods or non-alignment methods. For feature integration, multi-frame information is fused with network structures such as generative adversarial networks (GAN), recurrent neural networks (RNN), and Transformer. To align neighboring frames with the target frame, image-aligned methods use different motion estimation and motion compensation modules. They are segmented into three alignment-related categories: optical flow, kernel, and deformable convolution. Optical flow methods calculate the flow between frames from the gray-level changes of their pixels over time, and the neighboring frames are then warped accordingly. We further divide flow-aligned methods into four categories by network type: 2D convolutional neural network (CNN), RNN, GAN, and Transformer. For flow-aligned 2D convolution, we mainly introduce the video efficient sub-pixel convolutional neural network (VESPCN) and its improvements, such as ToFlow and the spatial-temporal transformer network (STTN). For flow-aligned RNN, we introduce the residual recurrent convolutional network (RRCN), the recurrent back-projection network (RBPN), and other related networks that align at the image level and are designed to resolve the constraints of sliding-window methods. To obtain excellent reconstruction performance, we then focus on BasicVSR (basic video super-resolution), IconVSR (information-refill mechanism and coupled propagation video super-resolution), and other networks that warp at the feature level. The flow-alignment-based TecoGAN (temporal coherence via self-supervision for GAN-based video generation) and VSR Transformer are covered as well. There are few kernel-based and deformable-convolution-based methods, so classifying them by network structure remains challenging. Because the kernel size limits the range of motion estimation, the reconstruction performance of kernel-based methods is relatively poor. Deformable convolution adds learned offsets to the sampling positions of conventional convolution, which bridges some of these gaps at the price of high computational complexity and harsh convergence conditions. Non-aligned methods use multiple network structures to exploit inter-frame correlation to a certain extent. They fall into four categories: non-aligned 3D convolution, non-aligned RNN, alignment-excluded GAN, and non-local.
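The flow-based alignment step described above (warping a neighboring frame toward the target frame) can be illustrated with a small sketch. It assumes the flow is already given as a per-pixel `(dx, dy)` offset from each target pixel to its match in the neighboring frame, uses bilinear sampling, and clamps samples at the image border. In the surveyed networks the flow is estimated by a learned module; that part is not shown, and the name `warp_frame` is hypothetical.

```python
import math


def warp_frame(neighbor, flow):
    """Backward-warp `neighbor` toward the target frame.

    neighbor: 2D list [H][W] of gray values.
    flow: 2D list [H][W] of (dx, dy) tuples; for target pixel (x, y) the
          matching location in `neighbor` is assumed to be (x + dx, y + dy).
    """
    height, width = len(neighbor), len(neighbor[0])
    warped = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the image border.
            sx = min(max(x + dx, 0.0), width - 1.0)
            sy = min(max(y + dy, 0.0), height - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            wx, wy = sx - x0, sy - y0
            # Bilinear interpolation between the four surrounding pixels.
            top = (1 - wx) * neighbor[y0][x0] + wx * neighbor[y0][x1]
            bottom = (1 - wx) * neighbor[y1][x0] + wx * neighbor[y1][x1]
            warped[y][x] = (1 - wy) * top + wy * bottom
    return warped
```

Image-level alignment applies this kind of warp to the frames themselves, while feature-level alignment applies the same operation to intermediate feature maps.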
Non-aligned RNN methods consist of recurrent latent space propagation (RLSP), the recurrent residual network (RRN), and omniscient video super-resolution (OVSR), and they demonstrate that a balance can be achieved between speed and visual quality. To reduce computational cost, an improved non-local module is introduced when alignment is excluded. All models are tested with 4× downsampling under two degradations: bicubic interpolation (BI) and blur downsampling (BD). Quantitative results on multiple datasets, including REDS4, UDM10, and Vid4, are summarized together with a speed comparison, and some visual effects are compared as well. The reconstruction performance of these video super-resolution networks keeps improving, the number of parameters has gradually shrunk, and training and inference have accelerated. However, practical application still needs to be facilitated, and we expect that adaptability must be improved and results validated. Current technologies can be developed further in nine aspects: network training and optimization, ultrahigh-resolution-oriented video super-resolution, compressed-video super-resolution, video-rescaling methods, self-supervised video super-resolution, various-scaled video super-resolution, spatio-temporal video super-resolution, auxiliary task-guided video super-resolution, and scenario-customized video super-resolution.
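To make the 4× BD (blur downsampling) test condition concrete, here is a rough sketch that blurs a frame with a separable Gaussian kernel and then keeps every fourth pixel. The Gaussian width `sigma=1.6`, the kernel radius, and border clamping are assumptions chosen for illustration; the abstract does not state the exact settings, and the BI (bicubic) variant is not shown.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with `kernel`, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                source_x = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[source_x]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    """Swap rows and columns of a 2D list."""
    return [list(column) for column in zip(*image)]


def bd_degrade(frame, scale=4, sigma=1.6, radius=6):
    """Gaussian-blur a 2D frame, then subsample it by `scale` (assumed BD)."""
    kernel = gaussian_kernel(sigma, radius)
    blurred = blur_rows(frame, kernel)                          # horizontal pass
    blurred = transpose(blur_rows(transpose(blurred), kernel))  # vertical pass
    return [row[::scale] for row in blurred[::scale]]
```

Applying `bd_degrade` to each high-resolution frame of a test sequence would produce the low-resolution input for a 4× model under these assumed settings.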
Similar resources
Super-Resolution via Deep Learning
The recent phenomenal interest in convolutional neural networks (CNNs) must have made it inevitable for the super-resolution (SR) community to explore its potential. The response has been immense and in the last three years, since the advent of the pioneering work, there appeared too many works not to warrant a comprehensive survey. This paper surveys the SR literature in the context of deep le...
Robust Fuzzy Content Based Regularization Technique in Super Resolution Imaging
Super-resolution (SR) aims to overcome the ill-posed conditions of image acquisition. SR facilitates scene recognition from low-resolution image(s). It is generally assumed that high- and low-resolution images share similar intrinsic geometries. Various approaches have tried to aggregate the informative details of multiple low-resolution images into a high-resolution one. In this paper, we present a n...
Effective deep learning training for single-image super-resolution in endomicroscopy exploiting video-registration-based reconstruction
Purpose: Probe-based Confocal Laser Endomicroscopy (pCLE) is a recent imaging modality that allows performing in vivo optical biopsies. The design of pCLE hardware, and its reliance on an optical fibre bundle, fundamentally limits the image quality with a few tens of thousands fibres, each acting as the equivalent of a single-pixel detector, assembled into a single fibre bundle. Video-registrat...
Deep Depth Super-Resolution: Learning Depth Super-Resolution Using Deep Convolutional Neural Network
Depth image super-resolution is an extremely challenging task due to the information loss in sub-sampling. Deep convolutional neural networks have been widely applied to color image super-resolution. Quite surprisingly, this success has not been matched in depth super-resolution. This is mainly due to the inherent difference between color and depth images. In this paper, we bridge the gap and...
A Deep Model for Super-resolution Enhancement from a Single Image
This study presents a method to reconstruct a high-resolution image using a deep convolution neural network. We propose a deep model, entitled Deep Block Super Resolution (DBSR), by fusing the output features of a deep convolutional network and a shallow convolutional network. In this way, our model benefits from high frequency and low frequency features extracted from deep and shallow networks...
Journal
Journal title: Journal of Image and Graphics
Year: 2023
ISSN: 1006-8961
DOI: https://doi.org/10.11834/jig.220130